1. SUMMARY

This  report  covers  our  work  for  the  period  12/15/80  — 
12/15/81  under  contract  number  DAJA37-81-C-0065.  During  this 
period  our  efforts  were  directed  toward  the  following  tasks: 

(a)  Survey  and  organize  a  list  of  analytic  feature  evaluation 
functions  for  use  in  information  acquisition  tasks. 

(b) Formulate military situation assessment tasks as
hierarchical multiperspective pattern classification
problems, and develop an approach for a decision support
system for situation assessment.

(c) Design, develop, and test the INFOACQ software by which
computer-based simulation of behavioral and analytic
information acquisition strategies may be investigated.

(d) Design, develop, and test the INFOACQ.EXP software which
tracks information acquisition strategies employed by
human decision makers in sequential classification tasks.

Section  2  lists  three  publications  which  cover  tasks  (a) 
and  (b).  All  three  of  them  were  submitted  as  part  of  our 
earlier  progress  reports.  Section  3  describes  our  work  under 
task (d), and Appendix I describes our work under task (c).
This  appendix  contains  a  major  part  of  a  new  article  which  is 
currently  under  preparation. 
2. PUBLICATIONS

1. Ben-Bassat M. Use of distance measures, information
measures, and error bounds in feature evaluation. In
Krishnaiah and Kanal L. N. (Eds.), The Handbook of Statistics,
Vol. II. North Holland Publishers, 1981.

2. Ben-Bassat M., Shaket E., and Freedy A. Research into an
intelligent decision support system for (military) situation
assessment. DSS-81 Transactions, pp. 143-151. First
International Conference on Decision Support Systems,
Atlanta, Georgia, June 1981.

3. Ben-Bassat M. and Freedy A. Knowledge requirements and
management in expert decision support systems for
(military) situation assessment. IEEE Trans. on Systems, Man
and Cybernetics, 1982 (In Press).

3.  HYPOTHESES  CONCERNING  INFORMATION  ACQUISITION 

Our hypothesis states: During sequential information
acquisition, a decision maker (DM) tends to concentrate on a
limited aspect of the problem. In pattern classification
problems, such behavior is manifested by:

(a) When the number of classes is larger than a certain
threshold (3 - 5), DM acquires information directed
toward the verification/elimination of a subset of the
complete set of classes.

(b) When the number of available sources of information is
larger than a certain threshold (5 - 7), DM evaluates
only a subset from which he selects the best one.

(c) When the number of classes and the number of features are
large, DM focuses his attention on a subset of the
class/feature matrix.

The first step in our plan is to test these hypotheses.
The data which will be obtained from the experiments to test
these hypotheses will be used to formulate additional
hypotheses concerning behavioral strategies for selecting the
subsets of classes and features. Discovering such strategies
will greatly assist designers of decision support systems in
creating better human-oriented systems. (See our original
proposal, January 1979.)

The basic hypotheses will be tested by interactive sessions
in which subjects will be requested to solve a situation
assessment problem which is formulated as a sequential Bayesian
diagnosis problem. At any given stage, the subject will have
access to any component of the problem, including current
probabilities of the possible classes and the conditional
probabilities of the features (information sources) which have
not yet been tested. By tracing the information and decision
aids that he requests, we will be able to confirm or disconfirm
our basic hypotheses, and will attempt to discover his
information acquisition strategy(ies) for solving the situation
assessment problem.

Four groups, each consisting of 10 subjects, will take part
in the experiments. In each group the order of classes (and
hence the order of prior probabilities) will be selected
randomly. This will permit identifying effects which are not
related to the basic hypothesis, but rather to other factors
such as the order in which the classes are displayed or the
location of high and low probabilities.

The problem will be presented to the subject as a medical
diagnosis problem in which he plays the doctor's role. No
knowledge of Bayesian statistics is required, however, since the
probability updates will be performed by the computer. The
subject's main task is to inform the system whether he wishes
information for the entire problem or for a limited aspect of
it. The subject's specific tasks will vary over the various
experiments. Each subject will be requested to solve five
problems.

A tape recorder will be used to present the problem
and technical instructions to each group of subjects. During
the experiments, only technical questions related to the
operation of the software will be answered. A subject who does
not comprehend the situation assessment problem will be
eliminated.

The key data which will be collected at each stage of the
sequential situation assessment process consist of the
following:

1) Classes being considered.

2) Posterior probabilities requested for display.

3) Entropy of posterior probabilities.

4)  Probability  of  error  if  a  decision  is  made  at  this  stage. 

5)  Time  between  consecutive  user's  inputs. 

For the overall process, we will consider:

1)  Average  and  standard  deviation  of  the  number  of  steps 
before  a  final  classification  is  made. 

2)  Average  and  standard  deviation  of  the  number  of  classes 
(and/or  features)  considered  at  each  stage. 

3)  Average  and  standard  deviation  of  the  sum  of  probabilities 
over  the  selected  classes. 

4)  Characteristics  of  classes  in  selected  subsets. 

In addition, each subject will be requested to verbalize
the reasons for each of his decisions. These protocols will be
analyzed to gain better insight into the subject's strategy.

A special version of the INFOACQ software was implemented
to examine behavioral information acquisition strategies. Named
INFOACQ.EXP and written in FORTRAN, this software operates on a
Digital PDT 11/151 microcomputer under the RT11 operating
system. (Such a system, with 64K core memory, two diskette
drives of 256K bytes each, and a VT100 terminal, costs $5000
today.) This software is now being used to run the experiments
described in Section 3. The present version interacts with the
user either in Hebrew or in English.

Many  of  the  experiments  have  already  been  performed  and  we 
1. INTRODUCTION


Pattern classification constitutes a major component of a
wide variety of decision making problems in the military,
medicine, management and other areas. These include battlefield
reading, target detection, situation assessment, medical
diagnosis, and the recognition of management and mismanagement
styles. Real time classification proceeds in a sequential manner.
Features (i.e., attributes, characteristics) are observed one or
more at a time, the information gain is assessed by modifying
the probabilities of the relevant alternatives (the classes),
and a decision whether testing is to be continued or terminated
is made. If a final decision cannot be made, the next feature is
selected for testing.

A  key  module  in  a  decision  support  system  for  pattern 
classification  tasks  generates  recommendations  for  the  next 
feature(s)  to  be  tested  in  order  to  converge  effectively  to  a 
final  decision.  Algorithms  for  such  a  module  have  been  widely 
proposed in the literature (see Section 2). The recommendations
suggested by such algorithms, however, are quite frequently, and
particularly for problems with tens of classes and features, far
from being natural. That is, the sequence of feature testing
proposed by the system seems weird to a human decision maker
working in this field, who fails to see the logic behind it. The
purpose of this paper is to identify some of the reasons for
this lack of naturalness of the existing algorithms, to propose
modified human-oriented algorithms, and to evaluate their
effectiveness relative to the existing algorithms.

2.  PROBLEM  FORMULATION 

The basic pattern classification problem is concerned with
the assignment of a given object to one of m known classes.
Adopting the Bayesian approach, the true class is considered as
a random variable C taking values in the set {1, 2, ..., m}, where
C = i represents class i. The initial uncertainty regarding the
true class is expressed by the prior probability vector

\[ A = (a_1, a_2, \ldots, a_m) \]

where $a_i \geq 0$, $\sum_{i=1}^{m} a_i = 1$. This uncertainty can
be modified by observing features (i.e., attributes,
characteristics) of the given object. Let $X_j$ denote feature
j and let $P_i(x_j)$ denote the conditional probability function of
feature j under class i for the value $X_j = x_j$. Once
$X_1, X_2, \ldots, X_n$ are observed, the prior probability of class i is
replaced by its posterior probability, which is given by Bayes'
theorem:

\[ a_i(x_1, x_2, \ldots, x_n) = \frac{a_i P_i(x_1, \ldots, x_n)}{\sum_{k=1}^{m} a_k P_k(x_1, \ldots, x_n)} \tag{1} \]
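The Bayes update can be illustrated with a minimal Python sketch. It assumes binary features that are conditionally independent given the class, so that the joint likelihood is a product of per-feature terms. The function name `posterior`, the matrix layout `cond_pos[i][j]`, and the three-class numbers are hypothetical and are not taken from Table 1.

```python
from math import prod

def posterior(prior, cond_pos, observed):
    """Bayes update of the class probabilities.

    prior    : list of class probabilities a_i (sums to 1)
    cond_pos : cond_pos[i][j] = P(feature j positive | class i)
    observed : dict {feature index j: 0 or 1}
    Assumes conditionally independent binary features.
    """
    joint = []
    for a_i, row in zip(prior, cond_pos):
        # Likelihood of the observed values under class i.
        like = prod(row[j] if x == 1 else 1.0 - row[j]
                    for j, x in observed.items())
        joint.append(a_i * like)
    total = sum(joint)  # denominator of Bayes' theorem
    return [v / total for v in joint]

# Hypothetical 3-class, 2-feature problem (illustrative numbers only).
prior = [0.5, 0.3, 0.2]
cond_pos = [[0.9, 0.2],
            [0.5, 0.5],
            [0.1, 0.8]]
post = posterior(prior, cond_pos, {0: 1})  # feature 0 observed positive
print([round(p, 3) for p in post])
```

With no observations the function returns the prior unchanged, since the empty product equals 1.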


Example 1

Consider the pattern classification problem faced by an
intelligence officer in situation assessment tasks. The
aggressor's course of action may be classified into seven
classes such as various types of attack, defend, delay or
withdraw. To predict the aggressor's intention, the intelligence
officer collects features (indicators) regarding the aggressor's
activities, for instance, extensive artillery preparation,
increased activity in rear areas, and so on.
These features are binary features which may attain only two
possible values, say 0 for negative and 1 for positive. The
probability for a positive and a negative response for each
feature changes under each course of action. Table 1 shows an
example with seven classes and six features. The entries of the
table represent the respective conditional probabilities for a
positive response. If, for instance, one of the features is
observed to be positive, the prior probabilities change from
(0.30, 0.25, 0.15, 0.15, 0.07, 0.07, 0.01) to (0.12, 0.26, 0.24,
0.16, 0.17, 0.02, 0.03).

In sequential classification, e.g., Fu (1968), the features
are tested one at a time, the posterior probabilities are
computed, and a decision whether testing is to be continued or
terminated is made. If testing is to be continued, the next
feature is then selected for testing. Otherwise, a
classification decision is made. When all types of errors are
of equal cost and all types of correct decisions are of equal
importance, the optimal Bayes decision rule assigns the pattern
to the class with the highest a posteriori probability, and the
Bayes risk reduces to the probability of error. Hence, if, at
the end of stage j, $j \geq 1$, a classification decision is made, the
resulting probability of error $P_e^{(j)}$ is given by:

\[ P_e^{(j)} = 1 - \max\{a_1^{(j)}, a_2^{(j)}, \ldots, a_m^{(j)}\} \tag{2} \]

where $a_i^{(j)}$ is the probability of class i at the end
of stage j.
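As a small check on this rule, the sketch below evaluates the probability of error for the two probability vectors quoted in Example 1. It also computes Shannon entropy for the same vectors. The function names are hypothetical, and the natural logarithm is an assumption, since the text does not state the base.

```python
from math import log

def prob_error(probs):
    """Probability of error if we stop now and choose the most probable class."""
    return 1.0 - max(probs)

def entropy(probs):
    """Shannon entropy of a class probability vector (natural log assumed)."""
    return -sum(p * log(p) for p in probs if p > 0.0)

# Probability vectors quoted in Example 1 (before and after one positive feature).
prior = [0.30, 0.25, 0.15, 0.15, 0.07, 0.07, 0.01]
post = [0.12, 0.26, 0.24, 0.16, 0.17, 0.02, 0.03]

print(round(prob_error(prior), 2), round(prob_error(post), 2))
print(round(entropy(prior), 3), round(entropy(post), 3))
```

For these numbers the probability of error moves from 0.70 to 0.74, so a single observation need not reduce it.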

Dynamic programming formulation of the sequential
classification problem provides, in principle, a method for
obtaining an optimal strategy regarding the stopping decision
and the feature selection decision, e.g., Cardillo and Fu (1967).
Computationally, however, dynamic programming procedures are
usually impractical, even for problems of moderate size and
large scale computers (see Bradt and Karlin (1956), Raiffa
(1961), and Fu (1968) p. 67). Another drawback of a dynamic
programming solution is due to the fact that the sequence of
testing generated by the algorithm is often not natural. That
is, although mathematically this sequence is the best, it is
difficult for a human decision maker to see the reasoning behind
it. This, of course, causes some reluctance to use that
solution.

One often used method to avoid the difficulties inherent in
a dynamic programming solution is to use suboptimal myopic
policies, i.e., policies which look only one or a few steps
ahead. By this approach, a stopping decision is reached when
the current probability of error is less than a predetermined
tolerance value. If this stopping rule is not satisfied, the
next feature is selected for testing according to a rule which
optimizes an objective function for a one step look ahead.
Assuming that the cost of testing for all the features is the
same, this objective function represents, in fact, the
information gain expected from the various features.

A frequently used feature evaluation function is derived
from Shannon's entropy, by which a feature X is preferred to Y if
the expected posterior uncertainty resulting from X:

\[ H(X) = E\left[-\sum_i a_i(X) \log a_i(X)\right] \tag{3} \]

is lower than that for Y. In (3) and throughout this paper $\sum$ is
from 1 to m and expectation is taken with respect to the mixed
distribution of X

\[ P(x) = \sum_i a_i P_i(x) \tag{4} \]

Alternatively, the features may be ranked by their expected
probability of error

\[ P_e(X) = E\left[1 - \max\{a_1(X), \ldots, a_m(X)\}\right] \tag{5} \]

Table 2 shows the feature ranking induced by the H
function and the $P_e$ function for the problem presented in
Example 1. From this table we learn that both functions
identify the same feature as the most promising one for the
next stage.
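A one-step look-ahead ranking of this kind can be sketched as follows. The sketch assumes binary features and takes the expectation over the mixed distribution of the feature. The function names and the two-feature numbers are hypothetical; they do not reproduce Table 1 or Table 2.

```python
from math import log

def entropy(probs):
    """Shannon entropy (natural log assumed)."""
    return -sum(p * log(p) for p in probs if p > 0.0)

def prob_error(probs):
    """Probability of error when choosing the most probable class."""
    return 1.0 - max(probs)

def expected_score(prior, pos_col, score):
    """Expected posterior score of one binary feature.

    pos_col[i] = P(feature positive | class i); score is entropy or prob_error.
    """
    total = 0.0
    for outcome in (1, 0):
        like = [p if outcome == 1 else 1.0 - p for p in pos_col]
        p_x = sum(a * l for a, l in zip(prior, like))  # mixed distribution P(x)
        if p_x == 0.0:
            continue  # this outcome cannot occur
        post = [a * l / p_x for a, l in zip(prior, like)]  # Bayes' theorem
        total += p_x * score(post)
    return total

def rank_features(prior, cond_pos, untested, score=entropy):
    """Return the untested feature indices, most promising (lowest score) first."""
    return sorted(
        untested,
        key=lambda j: expected_score(prior, [row[j] for row in cond_pos], score),
    )

# Hypothetical problem: feature 0 is informative, feature 1 is not.
prior = [0.5, 0.3, 0.2]
cond_pos = [[0.9, 0.5],
            [0.5, 0.5],
            [0.1, 0.5]]
print(rank_features(prior, cond_pos, [0, 1], entropy))
print(rank_features(prior, cond_pos, [0, 1], prob_error))
```

In this toy problem both evaluation functions place the informative feature first, because a feature with identical response probabilities under every class leaves the class probabilities unchanged.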

Ben-Bassat  (1978)  explores  the  efficiency  of  thirteen  (13) 
feature  evaluation  functions  in  a  myopic  strategy  for  solving 
Bayesian  pattern  classification  problems  with  conditionally 
independent  binary  features.  Using  an  extensive  set  of 
experiments  he  demonstrates  that  none  of  these  functions  is 
consistently superior to the others. On the average, they all
reach a final classification in about the same number of steps,
although the sequence of features may be somewhat different for
different strategies.

Myopic strategies seem to be closer to strategies used by
human decision makers. (Humans have to adopt this approach
simply because human limitations, in terms of computational and
memory resources, do not leave them any better choice.)
See Teeni et al. (1982) for literature and evidence confirming
this claim. Nevertheless, the sequence of testing generated by
the myopic strategies described above occasionally does not
correspond well to a sequence which would be generated by a
human decision maker. The reasons are identified and analyzed in
the next sections, where feature selection strategies which are
based on mathematical functions will be referred to as analytic,
as opposed to behavioral strategies, which refer to strategies

3. HUMAN ORIENTED STRATEGIES

Behavioral strategies differ from the analytic myopic
strategy described above (to be named henceforth strategy 0) in
two key aspects:

1) While strategy 0 always considers all of the m possible
classes, human decision makers tend to limit themselves to
a subset of the classes and to select features oriented
toward this subset only. The subset on which a decision
maker focuses attention depends on his personal style, as
well as on the specific stage of the classification
process. For example, in advanced stages DM may constrain
his view to the current most probable classes and look for
features which contribute mainly to their recognition.
His objective is to obtain the final piece of evidence
which is required to verify that the true class is indeed
one of the current most probable classes. In early stages,
his objective may be to select features which are
directed at the elimination of alternatives with low
probability so that they do not "bother" him in the next
stages.

2) At a given stage, strategy 0 ignores altogether the
history of the process, since its feature evaluation
function considers the current class probabilities only
(and the expected posterior probabilities). Human
decision makers typically employ considerations related to
the class probabilities in earlier stages. Assume, for
instance, that in the above example two of the features are
observed to be positive. This changes the class
probabilities from (0.30, 0.25, 0.15, 0.15, 0.07, 0.07,
0.01) to (0.11, 0.22, 0.33, 0.003, 0.26, 0.00, 0.05).
Considering only the posterior class probabilities to
determine the next best feature overlooks the fact that
the probability of class 5 has increased markedly from
0.07 to 0.26. Typically, such a significant change in
class probabilities triggers a human decision maker to
invest in features aimed at the verification/elimination
of this class.

In  what  follows  we  devise  and  investigate  several 
analytic  strategies  which  incorporate  human  heuristics. 

Strategy 1: Dynamic Subset

At each stage a subset of classes S is selected according
to the following procedure:

Step 1 Rank the classes by their triggering ratio T defined as
$T_i = a_i^{(n)} / a_i^{(0)}$. If for the most triggered class g,
$T_g > G$ (e.g., G = 2.0) and $a_g^{(n)} > F$ (e.g., F = 0.05), then
include class g in S. Otherwise, S remains empty.

Step 2 Take into S every class i for which $a_i^{(n)} > L$ (e.g., for
L = 0.30 there are three at the most).

Step 3 If S contains less than two classes, reduce the value
of L by 0.05 and go back to Step 2.


Once a subset S has been established, the probabilities of
the classes in S are normalized to sum up to 1, i.e.,

\[ a'_k = a_k / \sum_{i \in S} a_i \]

for $k \in S$. The next feature to be tested is determined by
applying strategy 0 to the reduced Bayesian classification
problem defined by S and the non-tested features.
Namely, a feature evaluation function such as H ranks the
features which have not yet been tested, and the top ranked
feature is selected.
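The subset selection and normalization steps can be sketched in Python as below. This is a sketch under stated assumptions, not the INFOACQ implementation: the triggering ratio is taken here as current probability divided by prior probability, the guard that stops lowering L once it reaches zero is an addition, and the function names are hypothetical. Classes are indexed from 0 in the code.

```python
def dynamic_subset(current, reference, G=2.0, F=0.05, L=0.30, step=0.05):
    """Select a class subset S following Steps 1-3 of the dynamic subset strategy.

    current   : class probabilities at the present stage
    reference : earlier probabilities used in the triggering ratio
                (assumed here to be the priors)
    """
    m = len(current)
    S = set()
    # Step 1: the most triggered class g enters S if it passes both thresholds.
    ratios = [c / r if r > 0 else 0.0 for c, r in zip(current, reference)]
    g = max(range(m), key=lambda i: ratios[i])
    if ratios[g] > G and current[g] > F:
        S.add(g)
    # Steps 2-3: add classes above L, lowering L until S holds two classes.
    while True:
        S |= {i for i in range(m) if current[i] > L}
        if len(S) >= 2 or L <= 0.0:  # the L <= 0 guard is an added safeguard
            break
        L -= step
    return sorted(S)

def normalize(current, S):
    """Rescale the probabilities of the classes in S so that they sum to 1."""
    total = sum(current[i] for i in S)
    return {i: current[i] / total for i in S}

# Probability vectors quoted in the text (0-based class indices in the code).
prior = [0.30, 0.25, 0.15, 0.15, 0.07, 0.07, 0.01]
current = [0.11, 0.22, 0.33, 0.003, 0.26, 0.00, 0.05]
S = dynamic_subset(current, prior)
print(S, normalize(current, S))
```

Under these assumptions the most triggered class is class 7, which fails the F test because its probability does not exceed 0.05, so the subset is formed by Steps 2 and 3 alone and contains classes 3 and 5 (indices 2 and 4).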

Strategy 2: Stable Subset

The mechanism of the dynamic subset is carried over with
one key difference: once a subset S is selected we continue
exploring it as long as there is a good reason to believe
that the true class is within the current subset.
Mathematically, at stage n we test whether

\[ \sum_{i \in S} a_i^{(n)} > D \tag{6} \]

where D is a predetermined value, say D = 0.5. If the sum of
the current probabilities exceeds D, the subset is maintained.
If the sum falls short, a new subset is generated as described
in strategy 1.
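The retention test can be sketched as a small wrapper around any subset selection routine. The function `update_stable_subset` and the stand-in selector `top_two` are hypothetical names. The stand-in simply picks the two most probable classes and is not the strategy 1 procedure.

```python
def update_stable_subset(current, S, select_new, D=0.5):
    """Keep S while its classes hold more than D of the probability mass.

    select_new is any callable that builds a fresh subset from the current
    probabilities (for example, a dynamic subset selection routine).
    """
    if S and sum(current[i] for i in S) > D:
        return list(S)
    return select_new(current)

def top_two(current):
    """Hypothetical stand-in selector: the two most probable classes."""
    order = sorted(range(len(current)), key=lambda i: current[i], reverse=True)
    return sorted(order[:2])

# Class probabilities quoted in the text (0-based class indices in the code).
probs = [0.11, 0.22, 0.33, 0.003, 0.26, 0.00, 0.05]
print(update_stable_subset(probs, [0, 1, 2, 3], top_two))  # mass 0.663, kept
print(update_stable_subset(probs, [0, 3], top_two))        # mass 0.113, replaced
```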

Strategy 3: Most Probable Class (MPCL)

A subset of two classes is generated: one is the most
probable class (MPCL), the other is the union of the rest of the
classes considered as a collective alternative to MPCL. The
subset remains unchanged as long as the MPCL remains the same.
The feature selection rule is as in strategy 0 applied to the
classification problem defined as follows.

Assuming that the MPCL is k, the class probabilities are
defined as follows:

\[ A' = (a_k, 1 - a_k) \]

and the conditional probabilities are

\[ P'_{1j} = P_{kj}, \qquad P'_{2j} = \Big(\sum_{i \neq k} a_i P_{ij}\Big) / (1 - a_k) \qquad \text{for all } j \tag{7} \]

where $P_{ij}$ denotes the conditional probability of a positive
response for feature j under class i.
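The reduction to a two-class problem can be sketched as follows. The function name `one_vs_rest` and the three-class numbers are hypothetical, and `cond_pos[i][j]` is assumed to hold the probability of a positive response for feature j under class i.

```python
def one_vs_rest(current, cond_pos, k):
    """Collapse an m-class problem into class k versus the union of the rest.

    Returns the two-class probability vector and a 2 x n matrix of
    conditional probabilities for a positive response.
    """
    a_k = current[k]
    rest = 1.0 - a_k
    if rest <= 0.0:
        raise ValueError("class k already has probability 1")
    n = len(cond_pos[k])
    row_k = list(cond_pos[k])
    # Mixture of the remaining classes, weighted by their current probabilities.
    row_rest = [
        sum(a * row[j] for i, (a, row) in enumerate(zip(current, cond_pos)) if i != k) / rest
        for j in range(n)
    ]
    return [a_k, rest], [row_k, row_rest]

# Hypothetical 3-class, 2-feature problem (illustrative numbers only).
current = [0.5, 0.3, 0.2]
cond_pos = [[0.9, 0.2],
            [0.5, 0.5],
            [0.1, 0.8]]
two_prior, two_cond = one_vs_rest(current, cond_pos, k=0)
print(two_prior, two_cond)
```

One useful property of this reduction is that the overall probability of a positive response for each feature is the same in the reduced problem as in the original one.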


Strategy 4: Most Triggered Class (MTCL) vs. the Rest

This strategy is the same as strategy 3, except that the
selected class is determined as the most triggered class (MTCL),
provided that $T_g > G$ and $a_g^{(n)} > F$. Otherwise MPCL is selected.

Comment: The notion of one against the rest employed
in strategies 3 and 4 has also been considered by
Kanal and Kulkarni ( ).

Table 3 summarizes the parameters used in the various
strategies.

Example

Table 4 illustrates the four strategies for the
classification problem of Example 1. The process stopping
threshold V is set at 0.85.


5.  EXPERIMENTS 

A  simulation  computer  program  was  written  to  evaluate  and 
compare  the  various  myopic  feature  selection  strategies.  For 
given problem dimensions and a myopic feature selection
strategy, the program flow consists of four main loops as shown
Table 3: List of Parameters and Their Values in the Experiments

Value      Parameter  Definition
---------  ---------  ------------------------------------------
0.5        D          A threshold to determine whether a given
                      subset is likely to include the true
                      class. (Defined in strategy 2)

0.05       F          A minimal value for class probability to
                      be eligible for inclusion in a subset.
                      (Defined in strategy 1)

2.00       G          A threshold for the triggering ratio.
                      (Defined in strategy 1)

0.30       L          A lower bound for class probability above
(or less)             which the class is included in the
                      selected subset regardless of its
                      probability in earlier stages.
                      (Defined in strategy 1)

0.85       V          A threshold used to terminate feature
                      acquisition once the probability of a
                      class is above V.


Table 4

                                   Stage
                    1        2        3        4        5
------------------  -------  -------  -------  -------  -------
Subset              All      All      All      All      All
Selected Feature    3        1        6        2        4
MPCL                2        5        5        5        5
MPCL Probability    .26      .51      .75      .46      .62

Subset              1,2,7,4  2,7      4,5      4,5
Selected Feature    3        6        1        4
MPCL                2        5        5        5
MPCL Probability    .26      .39      .75      .86

Subset              1,2,3,4  1,2,3,4  1,2,3,4  1,2,3,4  1,2,3,4
Selected Feature    3        4        2        1        6

Selected Class      1        2        5        2        5
Selected Feature    3        1        2        6        5
MPCL                2        5        2        5        5
MPCL Probability    .26      .51      .31      .47      .62

in Figure 1. In the outer loop IV a probability matrix P and a
prior probability vector are either randomly generated or read
in from an input device. In loop III we specify the stopping
threshold for terminating feature selection and classifying
the object to the most probable class. At the end of
loop III we have a complete definition of a Bayesian
classification problem.

In loop II a set of cases to be classified is generated by
the following procedure. First, a random class i is selected
according to the prior probabilities, and then an n dimensional
0-1 record is generated according to the conditional
probabilities of class i, representing a possible pattern from
class i.
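Case generation of this kind can be sketched in Python with the standard `random` module. The function name `generate_case`, the seed value, and the three-class numbers are hypothetical. The sketch assumes that each feature is drawn independently given the class.

```python
import random

def generate_case(prior, cond_pos, rng):
    """Draw a true class from the priors, then a 0-1 feature record for it.

    cond_pos[i][j] = P(feature j positive | class i); features are assumed
    to be drawn independently given the class.
    """
    classes = list(range(len(prior)))
    true_class = rng.choices(classes, weights=prior, k=1)[0]
    record = [1 if rng.random() < p else 0 for p in cond_pos[true_class]]
    return true_class, record

# Hypothetical 3-class, 3-feature problem (illustrative numbers only).
prior = [0.5, 0.3, 0.2]
cond_pos = [[0.9, 0.2, 0.5],
            [0.5, 0.5, 0.5],
            [0.1, 0.8, 0.5]]

rng = random.Random(1981)  # arbitrary seed; reusing it reproduces the cases
cases = [generate_case(prior, cond_pos, rng) for _ in range(5)]
print(cases)
```

Passing an explicit `random.Random` instance keeps the case stream separate from any other random draws in a program, which makes it easier to replay the same cases for different strategies.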

In  loop  I  a  feature  evaluation  function  is  specified  such 
as  Shannon's  entropy  or  the  Probability  of  Error.  Fourteen  such 
functions  may  be  selected  (See  Ben-Bassat  (1978)). 

For a given case the program proceeds as follows. By the
myopic strategy and the feature evaluation function it ranks and
selects a feature to be tested. The result of this feature
(either negative (0) or positive (1)) is retrieved from the
record of the case under consideration, and the
posterior probabilities are calculated. If the stopping rule is
not satisfied, we go back to feature evaluation and selection.
Otherwise we stop testing and classify the case to the most
probable class. The output consists of a detailed description
of the classification process for each case and summary data
as shown in Table 9. These data may be retained on a storage
Table 9

record 1

record 5

Number of features tested.
Probability of error ($P_e$) when process stopped.
Difference between initial and final $P_e$.
Difference between initial and final entropy.
Number of active classes when the process stopped.
Number of times the Trigger mechanism was employed, if any.
Power index of discrimination.

Sequence of chosen features.
device for further statistical analysis.

It should be noted that using the same seed for the
simulation tasks we are able to generate exactly the same
classification problem and cases so that comparisons between
strategies are made under the same conditions.
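The per-case procedure of this section can be sketched as a short loop. This is an illustrative sketch rather than the INFOACQ code: it uses expected posterior entropy as the evaluation function, stops once some class probability exceeds V, and assumes conditionally independent binary features. All function names and numbers are hypothetical.

```python
from math import log

def entropy(probs):
    """Shannon entropy (natural log assumed)."""
    return -sum(p * log(p) for p in probs if p > 0.0)

def update(probs, pos_col, outcome):
    """Bayes update of the class probabilities for one binary feature."""
    like = [p if outcome == 1 else 1.0 - p for p in pos_col]
    joint = [a * l for a, l in zip(probs, like)]
    total = sum(joint)
    return [v / total for v in joint]

def expected_entropy(probs, pos_col):
    """Expected posterior entropy of one feature (one-step look-ahead)."""
    value = 0.0
    for outcome in (0, 1):
        like = [p if outcome == 1 else 1.0 - p for p in pos_col]
        p_x = sum(a * l for a, l in zip(probs, like))
        if p_x > 0.0:
            value += p_x * entropy(update(probs, pos_col, outcome))
    return value

def classify_case(prior, cond_pos, record, V=0.85):
    """Test features myopically until some class probability exceeds V."""
    probs = list(prior)
    untested = list(range(len(record)))
    sequence = []

    def column(j):
        return [row[j] for row in cond_pos]

    while untested and max(probs) <= V:
        best = min(untested, key=lambda j: expected_entropy(probs, column(j)))
        probs = update(probs, column(best), record[best])
        untested.remove(best)
        sequence.append(best)
    decision = max(range(len(probs)), key=lambda i: probs[i])
    return decision, sequence, 1.0 - max(probs)

# Hypothetical problem and case (illustrative numbers only).
prior = [0.5, 0.3, 0.2]
cond_pos = [[0.9, 0.2, 0.5],
            [0.5, 0.5, 0.5],
            [0.1, 0.8, 0.5]]
record = [1, 0, 1]
decision, sequence, p_err = classify_case(prior, cond_pos, record)
print(decision, sequence, round(p_err, 3))
```

Because the loop is deterministic for a given problem and record, running two strategies on the same generated cases gives a like-for-like comparison.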

Using this program we are experimenting with the various
strategies in an attempt to learn their efficiency and
characteristics.